 Machakos


RideKE: Leveraging Low-Resource, User-Generated Twitter Content for Sentiment and Emotion Detection in Kenyan Code-Switched Dataset

Etori, Naome A., Gini, Maria L.

arXiv.org Artificial Intelligence

Social media has become a crucial open-access platform for individuals to express opinions and share experiences. However, leveraging low-resource language data from Twitter is challenging due to scarce, poor-quality content and major variations in language use, such as slang and code-switching. Identifying tweets in these languages can be difficult, as Twitter primarily supports high-resource languages. We analyze Kenyan code-switched data and evaluate four state-of-the-art (SOTA) transformer-based pretrained models for sentiment and emotion classification, using supervised and semi-supervised methods. We detail the methodology behind data collection and annotation, and the challenges encountered during the data curation phase. Our results show that XLM-R outperforms other models; for sentiment analysis, the supervised XLM-R model achieves the highest accuracy (69.2%) and F1 score (66.1%), and the semi-supervised XLM-R model reaches 67.2% accuracy and a 64.1% F1 score. In emotion analysis, supervised DistilBERT leads in accuracy (59.8%) and F1 score (31%), while semi-supervised mBERT reaches 59% accuracy and a 26.5% F1 score. AfriBERTa models show the lowest accuracy and F1 scores. All models tend to predict neutral sentiment, with AfriBERTa showing the highest bias and a unique sensitivity to the empathy emotion. https://github.com/NEtori21/Ride_hailing
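The gap the abstract reports between accuracy and F1 in emotion analysis (59.8% versus 31%) fits its remark that the models tend to predict the neutral class. The sketch below is illustrative only: the labels are hypothetical, and macro-averaged F1 is an assumption, since the abstract does not say how F1 was averaged. It shows how a classifier that always predicts the majority class can score reasonable accuracy but a low macro F1.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption here)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, not taken from the RideKE dataset.
y_true = ["neutral"] * 6 + ["positive"] * 2 + ["negative"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["neutral"] * 10

print(f"accuracy = {accuracy(y_true, y_pred):.2f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.2f}")
```

On this toy data, accuracy is 0.60 while macro F1 is 0.25, because the two minority classes each contribute an F1 of zero.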


High tech, high yields? The Kenyan farmers deploying AI to increase productivity

The Guardian

Sammy Selim strode through the dense, shiny green bushes on the slopes of his coffee farm in Sorwot village in Kericho, Kenya, accompanied by a younger farmer called Kennedy Kirui. They paused at each corner to input the farm's coordinates into a WhatsApp conversation. The conversation was with Virtual Agronomist, a tool that uses artificial intelligence to provide fertiliser application advice using chat prompts. The chatbot asked some further questions before producing a report saying that Selim should target a yield of 7.9 tonnes and use three types of fertiliser in specific quantities to achieve that goal. "My God!" Selim said upon receipt of the report.


Harnessing Artificial Intelligence for Sustainable Agricultural Development in Africa: Opportunities, Challenges, and Impact

Gikunda, Kinyua

arXiv.org Artificial Intelligence

This paper explores the transformative potential of artificial intelligence (AI) in the context of sustainable agricultural development across diverse regions in Africa. Delving into opportunities, challenges, and impact, the study navigates through the dynamic landscape of AI applications in agriculture. Opportunities such as precision farming, crop monitoring, and climate-resilient practices are examined, alongside challenges related to technological infrastructure, data accessibility, and skill gaps. The article analyzes the impact of AI on smallholder farmers, supply chains, and inclusive growth. Ethical considerations and policy implications are also discussed, offering insights into responsible AI integration. By providing a nuanced understanding, this paper contributes to the ongoing discourse on leveraging AI for fostering sustainability in African agriculture.


BART-SIMP: a novel framework for flexible spatial covariate modeling and prediction using Bayesian additive regression trees

Jiang, Alex Ziyu, Wakefield, Jon

arXiv.org Machine Learning

Prediction is a classic challenge in spatial statistics and the inclusion of spatial covariates can greatly improve predictive performance when incorporated into a model with latent spatial effects. It is desirable to develop flexible regression models that allow for nonlinearities and interactions in the covariate structure. Machine learning models have been suggested in the spatial context, allowing for spatial dependence in the residuals, but fail to provide reliable uncertainty estimates. In this paper, we investigate a novel combination of a Gaussian process spatial model and a Bayesian Additive Regression Tree (BART) model. The computational burden of the approach is reduced by combining Markov chain Monte Carlo (MCMC) with the Integrated Nested Laplace Approximation (INLA) technique. We study the performance of the method via simulations and use the model to predict anthropometric responses, collected via household cluster samples in Kenya.


Bill Gates claims 'magic seeds' engineered to adapt to climate change will help solve world hunger

Daily Mail - Science & tech

Bill Gates has called for greater investment in engineered crops that can adapt to climate change and resist agricultural pests, in an effort to solve world hunger. In the latest annual Goalkeepers Report from the Bill & Melinda Gates Foundation, Gates says the global hunger crisis is so immense that food aid cannot fully address the problem. What's also needed, he argues, are innovations in farming technology that can help to reverse the crisis. Gates points in particular to a breakthrough he calls 'magic seeds', including maize that has been bred to be more resistant to hotter, drier climates, and rice that requires three fewer weeks in the field. These innovations will allow agricultural productivity to increase despite the changing climate, he argues.


No Language Left Behind: Scaling Human-Centered Machine Translation

NLLB Team, Costa-jussà, Marta R., Cross, James, Çelebi, Onur, Elbayad, Maha, Heafield, Kenneth, Heffernan, Kevin, Kalbassi, Elahe, Lam, Janice, Licht, Daniel, Maillard, Jean, Sun, Anna, Wang, Skyler, Wenzek, Guillaume, Youngblood, Al, Akula, Bapi, Barrault, Loic, Gonzalez, Gabriel Mejia, Hansanti, Prangthip, Hoffman, John, Jarrett, Semarley, Sadagopan, Kaushik Ram, Rowe, Dirk, Spruit, Shannon, Tran, Chau, Andrews, Pierre, Ayan, Necip Fazil, Bhosale, Shruti, Edunov, Sergey, Fan, Angela, Gao, Cynthia, Goswami, Vedanuj, Guzmán, Francisco, Koehn, Philipp, Mourachko, Alexandre, Ropers, Christophe, Saleem, Safiyyah, Schwenk, Holger, Wang, Jeff

arXiv.org Artificial Intelligence

Driven by the goal of eradicating language barriers on a global scale, machine translation has solidified itself as a key focus of artificial intelligence research today. However, such efforts have coalesced around a small subset of languages, leaving behind the vast majority of mostly low-resource languages. What does it take to break the 200-language barrier while ensuring safe, high-quality results, all while keeping ethical considerations in mind? In No Language Left Behind, we took on this challenge by first contextualizing the need for low-resource language translation support through exploratory interviews with native speakers. Then, we created datasets and models aimed at narrowing the performance gap between low- and high-resource languages. More specifically, we developed a conditional compute model based on Sparsely Gated Mixture of Experts that is trained on data obtained with novel and effective data mining techniques tailored for low-resource languages. We propose multiple architectural and training improvements to counteract overfitting while training on thousands of tasks. Critically, we evaluated the performance of over 40,000 different translation directions using a human-translated benchmark, Flores-200, and combined human evaluation with a novel toxicity benchmark covering all languages in Flores-200 to assess translation safety. Our model achieves an improvement of 44% BLEU relative to the previous state of the art, laying important groundwork towards realizing a universal translation system.